Using sequence mining techniques for performance data

نویسنده

  • Todd Gamblin
چکیده

For this project, we attempted to evaluate the effectiveness of rudimentary sequence mining techniques for characterizing I/O trace data. We have taken trace information for a scientific application running on a cluster, and we have attempted to use K-Medoids based clustering algorithms to correlate particular trace sequences with phases of application execution. We also present a novel approach to sequence comparison, where we consider packets in sequences qualitatively based on high-level observations about the distribution of their durations. We show that our approach can reveal correlations between I/O sequences and phases of application execution, and that sequence mining techniques hold promise for adaptive performance monitoring.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Diagnosis of diabetes by using a data mining method based on native data

Background & Aim: Detecting the abnormal performance of diabetes and subsequently getting proper treatment can reduce the mortality associated with the disease. Also, timely diagnosis will result in irreversible complications for the patient. The aim of this study was to determine the status of diabetes mellitus using data mining techniques. Methods: This is an analytical study and its databas...

متن کامل

Prediction of Student Learning Styles using Data Mining Techniques

This paper focuses on the prediction of student learning styles using data mining techniques within their institutions. This prediction was aimed at finding out how different learning styles are achieved within learning environments which are specifically influenced by already existing factors. These learning styles, have been affected by different factors that are mainly engraved and found wit...

متن کامل

Detecting Diseases in Medical Prescriptions Using Data Mining Tools and Combining Techniques

Data about the prevalence of communicable and non-communicable diseases, as one of the most important categories of epidemiological data, is used for interpreting health status of communities. This study aims to calculate the prevalence of outpatient diseases through the characterization of outpatient prescriptions. The data used in this study is collected from 1412 prescriptions for various ty...

متن کامل

Estimation of Punching Shear Capacity of Concrete Slabs Using Data Mining Techniques

Punching shear capacity is a key factor for governing the collapsed form of slabs. This fragile failure that occurs at the slab-column connection is called punching shear failure and has been of concern for the engineers. The most common practice in evaluating the punching strength of the concrete slabs is to use the empirical expressions available in different building design codes. The estima...

متن کامل

Integrating AHP and data mining for effective retailer segmentation based on retailer lifetime value

Data mining techniques have been used widely in the area of customer relationship management (CRM). In this study, we have applied data mining techniques to address a problem in business-to-business (B2B) setting. In a manufacturer-retailer-consumer chain, a manufacturer should improve its relationship with retailers to continue its business. Segmentation is a useful tool for identifying groups...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2005